学习分解的表示形式需要监督或引入特定模型设计和学习限制作为偏见。Infogan是一个流行的分离框架,通过最大化潜在表示及其相应生成的图像之间的相互信息来学习无监督的分解表示形式。通过引入辅助网络和潜在回归损失的培训来实现共同信息的最大化。在这篇简短的探索性论文中,我们研究了希尔伯特 - 史密特独立标准(HSIC)的使用,以近似潜在表示和图像之间的相互信息,称为HSIC-INFOGAN。直接优化HSIC损失可以避免需要额外的辅助网络。我们定性地比较了每个模型中的分离水平,提出了一种调整HSIC-INFOGAN超参数的策略,并讨论了HSIC-INFOGAN在医疗应用中的潜力。
translated by 谷歌翻译
由于成本限制,减少医学图像分割中密集注释的面具的需求很重要。在本文中,我们考虑仅通过使用图像级标签进行训练来推断脑病变的像素级预测的问题。通过利用生成扩散概率模型(DPM)的最新进展,我们综合了“如果不存在X病理学,患者将如何出现?”。观察到的患者状态与健康反事实之间的差异图像可用于推断病理位置。我们产生的反事实是对应于输入的最小变化,以使其转化为健康域。这需要在DPM中使用健康和不健康的数据进行培训。我们通过通过隐式指导以及注意力条件而不是使用分类器来操纵生成过程来改善以前的反事实DPM。代码可在https://github.com/vios-s/diff-scm上找到。
translated by 谷歌翻译
培训医学图像分割模型通常需要大量标记的数据。相比之下,人类可以迅速学会从医学(例如MRI和CT)图像中准确地识别出有限的指导性解剖学。这种识别能力可以很容易地推广到来自不同临床中心的新图像。这种快速且可普遍的学习能力主要是由于人脑中图像模式的组成结构所致,该图像模式在医学图像分割中较少纳入。在本文中,我们将人类解剖结构的组成成分(即模式)建模为可学习的von-mises-fisher(VMF)内核,它们对从不同领域(例如临床中心)收集的图像具有鲁棒性。图像特征可以分解为具有组成操作的组件(或由)组成的组件,即VMF可能性。 VMF的可能性证明了每个解剖部分在图像的每个位置的可能性。因此,可以根据VMF的可能性预测分割掩模。此外,使用重建模块,未标记的数据也可以通过重新组合重建输入图像来学习VMF内核和可能性。广泛的实验表明,所提出的VMFNET在两个基准上实现了改善的概括性能,尤其是在注释有限的情况下。代码可在以下网址公开获取:https://github.com/vios-s/vmfnet。
translated by 谷歌翻译
甚至在没有受限,监督的情况下,也提出了甚至在没有受限或有限的情况下学习普遍陈述的方法。使用适度数量的数据可以微调新的目标任务,或者直接在相应任务中实现显着性能的无奈域中使用的良好普遍表示。这种缓解数据和注释要求为计算机愿景和医疗保健的应用提供了诱人的前景。在本辅导纸上,我们激励了对解散的陈述,目前关键理论和详细的实际构建块和学习此类表示的标准的需求。我们讨论医学成像和计算机视觉中的应用,强调了在示例钥匙作品中进行的选择。我们通过呈现剩下的挑战和机会来结束。
translated by 谷歌翻译
从公共机器学习(ML)模型中泄漏数据是一个越来越重要的领域,因为ML的商业和政府应用可以利用多个数据源,可能包括用户和客户的敏感数据。我们对几个方面的当代进步进行了全面的调查,涵盖了非自愿数据泄漏,这对ML模型很自然,潜在的恶毒泄漏是由隐私攻击引起的,以及目前可用的防御机制。我们专注于推理时间泄漏,这是公开可用模型的最可能场景。我们首先在不同的数据,任务和模型体系结构的背景下讨论什么是泄漏。然后,我们提出了跨非自愿和恶意泄漏的分类法,可用的防御措施,然后进行当前可用的评估指标和应用。我们以杰出的挑战和开放性的问题结束,概述了一些有希望的未来研究方向。
translated by 谷歌翻译
随着AI系统表现出越来越强烈的预测性能,它们的采用已经在许多域中种植。然而,在刑事司法和医疗保健等高赌场域中,由于安全,道德和法律问题,往往是完全自动化的,但是完全手工方法可能是不准确和耗时的。因此,对研究界的兴趣日益增长,以增加人力决策。除了为此目的开发AI技术之外,人民AI决策的新兴领域必须采用实证方法,以形成对人类如何互动和与AI合作做出决定的基础知识。为了邀请和帮助结构研究努力了解理解和改善人为 - AI决策的研究,我们近期对本课题的实证人体研究的文献。我们总结了在三个重要方面的100多篇论文中的研究设计选择:(1)决定任务,(2)AI模型和AI援助要素,以及(3)评估指标。对于每个方面,我们总结了当前的趋势,讨论了现场当前做法中的差距,并列出了未来研究的建议。我们的调查强调了开发共同框架的需要考虑人类 - AI决策的设计和研究空间,因此研究人员可以在研究设计中进行严格的选择,研究界可以互相构建并产生更广泛的科学知识。我们还希望这项调查将成为HCI和AI社区的桥梁,共同努力,相互塑造人类决策的经验科学和计算技术。
translated by 谷歌翻译
Neural compression offers a domain-agnostic approach to creating codecs for lossy or lossless compression via deep generative models. For sequence compression, however, most deep sequence models have costs that scale with the sequence length rather than the sequence complexity. In this work, we instead treat data sequences as observations from an underlying continuous-time process and learn how to efficiently discretize while retaining information about the full sequence. As a consequence of decoupling sequential information from its temporal discretization, our approach allows for greater compression rates and smaller computational complexity. Moreover, the continuous-time approach naturally allows us to decode at different time intervals. We empirically verify our approach on multiple domains involving compression of video and motion capture sequences, showing that our approaches can automatically achieve reductions in bit rates by learning how to discretize.
translated by 谷歌翻译
Multi-class ensemble classification remains a popular focus of investigation within the research community. The popularization of cloud services has sped up their adoption due to the ease of deploying large-scale machine-learning models. It has also drawn the attention of the industrial sector because of its ability to identify common problems in production. However, there are challenges to conform an ensemble classifier, namely a proper selection and effective training of the pool of classifiers, the definition of a proper architecture for multi-class classification, and uncertainty quantification of the ensemble classifier. The robustness and effectiveness of the ensemble classifier lie in the selection of the pool of classifiers, as well as in the learning process. Hence, the selection and the training procedure of the pool of classifiers play a crucial role. An (ensemble) classifier learns to detect the classes that were used during the supervised training. However, when injecting data with unknown conditions, the trained classifier will intend to predict the classes learned during the training. To this end, the uncertainty of the individual and ensemble classifier could be used to assess the learning capability. We present a novel approach for novel detection using ensemble classification and evidence theory. A pool selection strategy is presented to build a solid ensemble classifier. We present an architecture for multi-class ensemble classification and an approach to quantify the uncertainty of the individual classifiers and the ensemble classifier. We use uncertainty for the anomaly detection approach. Finally, we use the benchmark Tennessee Eastman to perform experiments to test the ensemble classifier's prediction and anomaly detection capabilities.
translated by 谷歌翻译
Are extralinguistic signals such as image pixels crucial for inducing constituency grammars? While past work has shown substantial gains from multimodal cues, we investigate whether such gains persist in the presence of rich information from large language models (LLMs). We find that our approach, LLM-based C-PCFG (LC-PCFG), outperforms previous multi-modal methods on the task of unsupervised constituency parsing, achieving state-of-the-art performance on a variety of datasets. Moreover, LC-PCFG results in an over 50% reduction in parameter count, and speedups in training time of 1.7x for image-aided models and more than 5x for video-aided models, respectively. These results challenge the notion that extralinguistic signals such as image pixels are needed for unsupervised grammar induction, and point to the need for better text-only baselines in evaluating the need of multi-modality for the task.
translated by 谷歌翻译
Riemannian geometry provides powerful tools to explore the latent space of generative models while preserving the inherent structure of the data manifold. Lengths, energies and volume measures can be derived from a pullback metric, defined through the immersion that maps the latent space to the data space. With this in mind, most generative models are stochastic, and so is the pullback metric. Manipulating stochastic objects is strenuous in practice. In order to perform operations such as interpolations, or measuring the distance between data points, we need a deterministic approximation of the pullback metric. In this work, we are defining a new metric as the expected length derived from the stochastic pullback metric. We show this metric is Finslerian, and we compare it with the expected pullback metric. In high dimensions, we show that the metrics converge to each other at a rate of $\mathcal{O}\left(\frac{1}{D}\right)$.
translated by 谷歌翻译